27 research outputs found

    Standardization of a Communication Middleware for High-Performance Real-Time Systems

    The last several years have seen the emergence of standardization activities for real-time systems, including standardization of operating systems (the series of POSIX standards [1]), of communication for distributed systems (POSIX.21 [10]) and parallel systems (MPI/RT [5]), and of real-time object management (real-time CORBA [9]). This article describes the ongoing standardization work on, and implementation of, communication middleware for high-performance real-time computing. The Real-Time Message Passing Interface (MPI/RT) advances the non-real-time, high-performance Message Passing Interface (MPI) standard, emphasizing changes that enable and support real-time communication, and is targeted at embedded, fault-tolerant, and other real-time systems. MPI/RT is the only communication middleware layer that provides guaranteed quality of service and timeliness for data transfers; it is also targeted at real-time CORBA, as a replacement for the RPC layer, and at real-time and embedded Java.
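
    The distinctive claim here is guaranteed quality of service and timeliness for data transfers. As a rough illustration of that idea only (the Channel class and its deadline bookkeeping below are invented for this sketch and are not the MPI/RT API), a transfer can be checked against a per-channel deadline:

        import time

        class Channel:
            """Hypothetical deadline-checked channel (NOT the MPI/RT API)."""

            def __init__(self, deadline_s):
                self.deadline_s = deadline_s  # maximum allowed transfer time, seconds

            def send(self, payload, transfer):
                """Run `transfer(payload)` and report whether the deadline held."""
                start = time.monotonic()
                transfer(payload)
                elapsed = time.monotonic() - start
                return elapsed <= self.deadline_s, elapsed

        # A sleep stands in for the actual network transfer in this sketch.
        ch = Channel(deadline_s=0.01)
        ok, elapsed = ch.send(b"sensor frame", lambda p: time.sleep(0.002))
        print("deadline met" if ok else "deadline missed", f"({elapsed * 1e3:.2f} ms)")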

    Dynamic Management Of Heterogeneous Resources

    This paper presents techniques for dynamic load balancing in heterogeneous computing environments, that is, for sets of machines with varying processing capabilities and memory capacities. These methods can also be applied to homogeneous systems in which the effective compute speed or memory availability is reduced by the presence of other programs running outside the target computation. To handle heterogeneous systems, a precise distinction is made between an abstract quantity of work, which might be measured as the number of iterations of a loop or the count of some data structure, and the utilization of resources, measured in seconds of processor time or bytes of memory, required by that work. Once that distinction is clearly drawn, the modifications to existing load balancing techniques are fairly straightforward. The effectiveness of the resulting load balancing system is demonstrated for a large-scale particle simulation on a network of heterogeneous PCs, workstations, and multiprocessor servers.
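
    The central distinction is between abstract work units (iterations, particles) and the resources that work consumes on a particular machine. A minimal sketch of that separation, assuming each machine's rate has already been measured in work units per second (function name and numbers are illustrative only):

        def partition_by_rate(total_work, rates):
            """Split `total_work` abstract work units (iterations, particles, ...)
            so that every machine needs roughly the same wall-clock time.

            rates: measured speeds in work units per second, which differ across
            heterogeneous machines or when outside programs steal cycles.
            """
            total_rate = sum(rates)
            # Each machine's expected time is share / rate == total_work / total_rate,
            # the same for all machines even though the work counts differ.
            return [total_work * r / total_rate for r in rates]

        # Example: one million particles over three machines of differing speed.
        print(partition_by_rate(1_000_000, rates=[50_000, 120_000, 30_000]))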

    Building a High-Performance Collective Communication Library

    We report on a project to develop a unified approach for building a library of collective communication operations that performs well on a cross-section of problems encountered in real applications. The target architecture is a two-dimensional mesh with wormhole routing, but the techniques are more general. The approach differs from traditional library implementations in that we address the need for implementations that perform well for vectors of various sizes and for various grid dimensions, including non-power-of-two grids. We show how a general approach to hybrid algorithms yields performance across the entire range of vector lengths. Moreover, many scalable implementations of application libraries require collective communication within groups of nodes; our approach yields the same kind of performance for group collective communication. Results from the Intel Paragon system are included.
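
    A hybrid algorithm switches between a latency-bound method for short vectors and a bandwidth-bound method for long ones, with the crossover chosen from a simple cost model. The sketch below illustrates the selection idea for broadcast; the algorithms, cost formulas, and constants are textbook-style approximations chosen for illustration, not the library's actual implementation:

        from math import log2

        def choose_broadcast(n_bytes, p, alpha=1e-5, beta=1e-9):
            """Pick a broadcast algorithm from a linear latency/bandwidth cost model.

            alpha: per-message start-up cost (s); beta: per-byte transfer time (s).
            """
            # Binomial tree: log2(p) messages of the full vector (latency-friendly).
            tree = log2(p) * (alpha + beta * n_bytes)
            # Scatter followed by allgather: more messages, but each node forwards
            # only ~2(p-1)/p of the vector (bandwidth-friendly for long vectors).
            scatter_allgather = (log2(p) + p - 1) * alpha + 2 * (p - 1) / p * beta * n_bytes
            return "binomial tree" if tree <= scatter_allgather else "scatter+allgather"

        for n in (64, 4_096, 1_048_576):
            print(n, "bytes ->", choose_broadcast(n, p=64))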

    Dynamic load balancing and granularity control on heterogeneous and hybrid architectures

    The past several years have seen concurrent applications grow increasingly complex, as the most advanced techniques from academia find their way into production parallel applications. Moreover, the platforms on which these concurrent computations now execute are frequently heterogeneous networks of workstations and shared-memory multiprocessors, because of their low cost relative to traditional large-scale multicomputers. The combination of sophisticated algorithms and more complex computing environments has made existing load balancing techniques obsolete. Current methods characterize the loads of tasks in very simple terms, often fail to account for the communication costs of an application, and typically consider computational resources to be homogeneous. The complexity of current applications, coupled with the fact that they are running in heterogeneous environments, has also made partitioning a problem for concurrent execution an ordeal. It is no longer adequate to simply divide the problem into some number of pieces per computer and hope for the best. In a complex application, the workloads of the pieces, which may be equal initially, may diverge over time. On a heterogeneous network, the varying capabilities of the computers will widen this disparity in resource usage even further. Thus, there is a need to dynamically manage the granularity of an application, repartitioning the problem at runtime to correct inadequacies in the original partitioning and to make more effective use of computational resources. This thesis presents techniques for dynamic load balancing in complex irregular applications. Advances over previous work are threefold: First, these techniques are applicable to networks composed of heterogeneous machines, including both single-processor workstations and personal computers, and multiprocessor compute servers. Second, the use of load vectors more accurately characterizes the resource requirements of tasks, including the computational demands of different algorithmic phases as well as the needs for other resources, such as memory. Finally, runtime repartitioning adjusts the granularity of the problem so that the available resources are more fully utilized. Two further improvements over earlier techniques are better algorithms for determining the ideal redistribution of work and advanced techniques for selecting which tasks to transfer to satisfy those ideals. The latter algorithms incorporate the notion of task migration costs, including the impact on an application's communication locality. The improvements listed above are demonstrated on both industrial applications and small parametric problems, on networks of heterogeneous computers as well as traditional large-scale multicomputers.
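
    One of the improvements described is selecting which tasks to transfer while accounting for migration cost and the impact on communication locality. The following greedy sketch illustrates that trade-off; the scoring function and task representation are invented for illustration and are not the thesis's actual algorithm:

        def select_tasks(tasks, target_work, locality_weight=1.0):
            """Greedily pick tasks to migrate until roughly `target_work` is moved.

            Each task is (work, migration_cost, remote_edges), where remote_edges
            counts communication partners that would become off-node after the move.
            Tasks moving the most work per unit of migration/locality cost go first.
            """
            def score(task):
                work, migration_cost, remote_edges = task
                return work / (migration_cost + locality_weight * remote_edges + 1e-9)

            chosen, moved = [], 0.0
            for task in sorted(tasks, key=score, reverse=True):
                if moved >= target_work:
                    break
                chosen.append(task)
                moved += task[0]
            return chosen, moved

        tasks = [(5.0, 2.0, 1), (3.0, 0.5, 0), (8.0, 6.0, 4), (2.0, 0.2, 0)]
        print(select_tasks(tasks, target_work=7.0))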

    A Practical Approach to Dynamic Load Balancing

    This paper presents a cohesive, practical load balancing framework that improves upon existing strategies. These techniques are portable to a broad range of prevalent architectures, including massively parallel machines such as the Cray T3D/E and Intel Paragon, shared-memory systems such as the SGI PowerChallenge, and networks of workstations. As part of the work, an adaptive heat diffusion scheme is presented, as well as a task selection mechanism that can preserve or improve communication locality. Unlike many previous efforts in this arena, the techniques have been applied to two large-scale industrial applications on a variety of multicomputers. In the process, this work exposes a serious deficiency in current load balancing strategies, motivating further work in this area.
    Keywords: dynamic load balancing, diffusion, massively parallel computing, irregular problems
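
    The adaptive heat diffusion scheme builds on first-order diffusion balancers, in which each processor repeatedly exchanges a fraction of its load difference with every neighbor. A minimal, non-adaptive sketch of that underlying iteration (the diffusion coefficient and processor graph below are illustrative, not the paper's adaptive scheme):

        def diffuse(load, neighbors, alpha=0.25, sweeps=50):
            """First-order diffusion: in each sweep, node i exchanges
            alpha * (load[i] - load[j]) with every neighbor j.

            load: per-processor loads; neighbors: adjacency list of the processor
            graph (e.g., a mesh); alpha must be small enough for stability.
            """
            load = list(load)
            for _ in range(sweeps):
                delta = [0.0] * len(load)
                for i, nbrs in enumerate(neighbors):
                    for j in nbrs:
                        delta[i] -= alpha * (load[i] - load[j])
                load = [l + d for l, d in zip(load, delta)]
            return load

        # Four processors in a ring, with an initial hot spot on processor 0.
        print(diffuse([10.0, 0.0, 0.0, 2.0], neighbors=[[1, 3], [0, 2], [1, 3], [0, 2]]))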

    Practical Dynamic Load Balancing for Irregular Problems

    In this paper, we present a cohesive, practical load balancing framework that addresses many shortcomings of existing strategies. These techniques are portable to a broad range of prevalent architectures, including massively parallel machines such as the Cray T3D and Intel Paragon, shared-memory systems such as the SGI Power Challenge, and networks of workstations. This scheme improves on earlier work in this area and can be analyzed using well-understood techniques. The algorithm operates using nearest-neighbor communication and inherently maintains existing locality in the application. A simple software interface allows the programmer to use load balancing with very little effort. Unlike many previous efforts in this arena, the techniques have been applied to large-scale industrial applications, one of which is described herein.
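
    The paper highlights a simple software interface that lets the programmer invoke nearest-neighbor load balancing with little effort. The sketch below shows what such a call pattern could look like; the class, callback, and heuristic are hypothetical and are not the paper's actual interface:

        class LoadBalancer:
            """Hypothetical programmer-facing interface (names are invented here,
            not the paper's actual API)."""

            def __init__(self, weight):
                self.weight = weight  # user callback: task -> estimated cost

            def balance(self, my_tasks, neighbor_loads):
                """Split local tasks into (keep, send) using only the local load
                and the loads reported by nearest neighbors."""
                my_load = sum(self.weight(t) for t in my_tasks)
                avg = (my_load + sum(neighbor_loads)) / (1 + len(neighbor_loads))
                keep, send = [], []
                for t in sorted(my_tasks, key=self.weight):
                    if my_load > avg and self.weight(t) <= my_load - avg:
                        send.append(t)
                        my_load -= self.weight(t)
                    else:
                        keep.append(t)
                return keep, send

        # Called once per iteration from the application's main loop.
        lb = LoadBalancer(weight=len)  # here a task is just a list of particles
        print(lb.balance(my_tasks=[[1] * 30, [1] * 20, [1] * 5], neighbor_loads=[10.0, 15.0]))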

    A Load Balancing Technique for Multiphase Computations

    Parallel computations composed of multiple, tightly interwoven phases of computation may require a different approach to dynamic load balancing than single-phase computations. This paper presents a load sharing method based on a view of load as a vector rather than as a scalar. This approach allows multiphase computations to achieve higher efficiency on large-scale multicomputers than is possible with traditional techniques. Results are presented for two large-scale particle simulations running on 128 nodes of an Intel Paragon and on 256 processors of a Cray T3D, respectively.
    INTRODUCTION
    Load balancing techniques already in the literature have concentrated entirely on single-phase computations (Boillat 1990; Cybenko 1989; Evans and Butt 1993; Heirich and Taylor 1995; Horton 1993; Kohring 1995; Lin and Keller 1987; Muniz and Zaluska 1995; Song 1994; Walshaw and Berzins 1995; Watts et al. 1996; Willebeek-LeMair and Reeves 1993; Williams 1991; Xu and Lau 1997). That is, they work only ..
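
    Viewing load as a vector means recording each node's cost per phase and balancing every component, because tightly synchronized phases each run at the pace of their slowest node. A small worked example of the difference between the scalar and vector views, with made-up numbers:

        def imbalance(per_node_loads):
            """Max/mean ratio: 1.0 means perfectly balanced."""
            mean = sum(per_node_loads) / len(per_node_loads)
            return max(per_node_loads) / mean

        # Two nodes, two tightly synchronized phases (say, a fluid step and a
        # particle step). Loads are vectors: [phase-1 seconds, phase-2 seconds].
        node_a = [9.0, 1.0]
        node_b = [1.0, 9.0]

        # Scalar view: both nodes do 10 s of work in total, so it looks balanced.
        print(imbalance([sum(node_a), sum(node_b)]))  # 1.0

        # Vector view: each phase waits for its slowest node, so the run takes
        # 9 s + 9 s = 18 s instead of the ideal 10 s.
        print(imbalance([node_a[0], node_b[0]]))  # 1.8 in phase 1
        print(imbalance([node_a[1], node_b[1]]))  # 1.8 in phase 2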